在本文中,我们提出了与IEEE计算机协会在CVPR 2022上同时与IEEE计算机协会研讨会同时举行的多手术检测挑战。我们的多手术检测挑战旨在检测自动图像操作,包括但不限于图像编辑,图像合成,图像合成,图像,图像,图像,图像合成,图像,图像编辑一代,图像Photoshop等。我们的挑战吸引了来自世界各地的674支团队,约有2000个有效的结果提交数量。我们邀请了前十支球队为挑战提供解决方案,其中三支球队在大结局中获得了奖项。在本文中,我们介绍了前三名团队的解决方案,以增强图像伪造检测领域的研究工作。
translated by 谷歌翻译
口语理解(SLU)将自动语音识别(ASR)和自然语言理解(NLU)视为一项统一任务,通常遭受数据稀缺。我们基于元辅助学习来利用ASR和NLU联合培训方法,通过仅利用大量的语音数据来提高低资源SLU任务的性能。这种方法的一个明显优势是,它提供了一个灵活的框架来实施低资源的SLU训练任务,而无需访问任何进一步的语义注释。特别是,NLU模型被视为标签生成网络,以预测文本的意图和插槽标签。多任务网络网络从语音同步训练ASR任务和SLU任务;标签生成网络的预测作为语义目标传递到多任务网络。通过公共CATSLU数据集的实验证明了所提出的算法的效率,该数据集对下游NLU任务产生了更合适的ASR假设。
translated by 谷歌翻译
磁共振成像(MRI)是一种重要的非侵入性临床工具,可以产生高分辨率和可重复的图像。然而,高质量的MR图像需要长时间的扫描时间,这导致患者的疲惫和不适,由于患者的自愿运动和非自愿的生理运动,诱导更多人工制品。为了加速扫描过程,通过K空间欠采样和基于深度学习的重建的方法已经推广。这项工作引进了SwinMR,这是一种基于新型的Swin变压器的快速MRI重建方法。整个网络由输入模块(IM)组成,特征提取模块(FEM)和输出模块(OM)。 IM和OM是2D卷积层,并且FEM由级联的残留的Swin变压器块(RSTBS)和2D卷积层组成。 RSTB由一系列SWIN变压器层(STL)组成。 STL的Shifted Windows多头自我关注(W-MSA / SW-MSA)在移位的窗口中执行,而不是整个图像空间中原始变压器的多头自我关注(MSA)。通过使用灵敏度图提出了一种新的多通道损耗,这被证明是为了保留更多纹理和细节。我们在Calgary-Campinas公共大脑MR DataSet中进行了一系列比较研究和消融研究,并在多模态脑肿瘤细分挑战2017年数据集中进行了下游分段实验。结果表明,与其他基准方法相比,我们的SwinMR实现了高质量的重建,并且它在噪音中断和不同的数据集中显示了不同的遮光罩掩模的稳健性。该代码在https://github.com/ayanglab/swinmr公开使用。
translated by 谷歌翻译
图像恢复算法(如超分辨率(SR)都是用于在劣化图像中的对象检测的必不可少的预处理模块。然而,大多数这些算法假设劣化是固定的并且已知先验。当真实劣化未知或与假设不同时,预处理模块和随后的高级任务(如对象检测)将失败。在这里,我们提出了一种新颖的框架,重新定位,以检测降低的低分辨率图像中的对象。 Restoredet利用下采样的降级作为自我监督信号的一种转换,以探索针对各种分辨率和其他降级条件的等分性表示。具体地,我们通过从一对原始和随机降级的图像编码和解码劣化转换来学习这种内在视觉结构。该框架可以进一步利用先进的SR架构的优点,该架构具有任意分辨率还原解码器以重建来自劣化的输入图像的原始对应关系。代表学习和对象检测都以端到端的培训方式共同优化。 Restoredet是一个通用框架,可以在任何主流对象检测架构上实现。广泛的实验表明,与在面对变体退化情况时,我们基于Centernet的框架已经实现了卓越的性能。我们的代码即将发布。
translated by 谷歌翻译
加权最近的邻居(WNN)估计量通常用作平均回归估计的灵活且易于实现的非参数工具。袋装技术是一种优雅的方式,可以自动生成最近邻居的重量的WNN估计器;我们将最终的估计量命名为分布最近的邻居(DNN),以便于参考。然而,这种估计器缺乏分布结果,从而将其应用于统计推断。此外,当平均回归函数具有高阶平滑度时,DNN无法达到最佳的非参数收敛率,这主要是由于偏差问题。在这项工作中,我们对DNN提供了深入的技术分析,我们建议通过线性将两个DNN估计量与不同的子采样量表进行线性相结合,从而提出了DNN估计量的偏差方法,从而导致新型的两尺度DNN(TDNN(TDNN) )估计器。两尺度的DNN估计量具有等效的WNN表示,重量承认明确形式,有些则是负面的。我们证明,由于使用负权重,两尺度DNN估计器在四阶平滑度条件下估算回归函数时享有最佳的非参数收敛速率。我们进一步超出了估计,并确定DNN和两个规模的DNN均无渐进地正常,因为亚次采样量表和样本量差异到无穷大。对于实际实施,我们还使用二尺度DNN的Jacknife和Bootstrap技术提供方差估计器和分配估计器。可以利用这些估计器来构建有效的置信区间,以用于回归函数的非参数推断。建议的两尺度DNN方法的理论结果和吸引人的有限样本性能用几个数值示例说明了。
translated by 谷歌翻译
In this paper, we study the problem of knowledge-intensive text-to-SQL, in which domain knowledge is necessary to parse expert questions into SQL queries over domain-specific tables. We formalize this scenario by building a new Chinese benchmark KnowSQL consisting of domain-specific questions covering various domains. We then address this problem by presenting formulaic knowledge, rather than by annotating additional data examples. More concretely, we construct a formulaic knowledge bank as a domain knowledge base and propose a framework (ReGrouP) to leverage this formulaic knowledge during parsing. Experiments using ReGrouP demonstrate a significant 28.2% improvement overall on KnowSQL.
translated by 谷歌翻译
Weakly-supervised object localization aims to indicate the category as well as the scope of an object in an image given only the image-level labels. Most of the existing works are based on Class Activation Mapping (CAM) and endeavor to enlarge the discriminative area inside the activation map to perceive the whole object, yet ignore the co-occurrence confounder of the object and context (e.g., fish and water), which makes the model inspection hard to distinguish object boundaries. Besides, the use of CAM also brings a dilemma problem that the classification and localization always suffer from a performance gap and can not reach their highest accuracy simultaneously. In this paper, we propose a casual knowledge distillation method, dubbed KD-CI-CAM, to address these two under-explored issues in one go. More specifically, we tackle the co-occurrence context confounder problem via causal intervention (CI), which explores the causalities among image features, contexts, and categories to eliminate the biased object-context entanglement in the class activation maps. Based on the de-biased object feature, we additionally propose a multi-teacher causal distillation framework to balance the absorption of classification knowledge and localization knowledge during model training. Extensive experiments on several benchmarks demonstrate the effectiveness of KD-CI-CAM in learning clear object boundaries from confounding contexts and addressing the dilemma problem between classification and localization performance.
translated by 谷歌翻译
Dynamic treatment regimes assign personalized treatments to patients sequentially over time based on their baseline information and time-varying covariates. In mobile health applications, these covariates are typically collected at different frequencies over a long time horizon. In this paper, we propose a deep spectral Q-learning algorithm, which integrates principal component analysis (PCA) with deep Q-learning to handle the mixed frequency data. In theory, we prove that the mean return under the estimated optimal policy converges to that under the optimal one and establish its rate of convergence. The usefulness of our proposal is further illustrated via simulations and an application to a diabetes dataset.
translated by 谷歌翻译
Nowadays, time-stamped web documents related to a general news query floods spread throughout the Internet, and timeline summarization targets concisely summarizing the evolution trajectory of events along the timeline. Unlike traditional document summarization, timeline summarization needs to model the time series information of the input events and summarize important events in chronological order. To tackle this challenge, in this paper, we propose a Unified Timeline Summarizer (UTS) that can generate abstractive and extractive timeline summaries in time order. Concretely, in the encoder part, we propose a graph-based event encoder that relates multiple events according to their content dependency and learns a global representation of each event. In the decoder part, to ensure the chronological order of the abstractive summary, we propose to extract the feature of event-level attention in its generation process with sequential information remained and use it to simulate the evolutionary attention of the ground truth summary. The event-level attention can also be used to assist in extracting summary, where the extracted summary also comes in time sequence. We augment the previous Chinese large-scale timeline summarization dataset and collect a new English timeline dataset. Extensive experiments conducted on these datasets and on the out-of-domain Timeline 17 dataset show that UTS achieves state-of-the-art performance in terms of both automatic and human evaluations.
translated by 谷歌翻译
Hybrid unmanned aerial vehicles (UAVs) integrate the efficient forward flight of fixed-wing and vertical takeoff and landing (VTOL) capabilities of multicopter UAVs. This paper presents the modeling, control and simulation of a new type of hybrid micro-small UAVs, coined as lifting-wing quadcopters. The airframe orientation of the lifting wing needs to tilt a specific angle often within $ 45$ degrees, neither nearly $ 90$ nor approximately $ 0$ degrees. Compared with some convertiplane and tail-sitter UAVs, the lifting-wing quadcopter has a highly reliable structure, robust wind resistance, low cruise speed and reliable transition flight, making it potential to work fully-autonomous outdoor or some confined airspace indoor. In the modeling part, forces and moments generated by both lifting wing and rotors are considered. Based on the established model, a unified controller for the full flight phase is designed. The controller has the capability of uniformly treating the hovering and forward flight, and enables a continuous transition between two modes, depending on the velocity command. What is more, by taking rotor thrust and aerodynamic force under consideration simultaneously, a control allocation based on optimization is utilized to realize cooperative control for energy saving. Finally, comprehensive Hardware-In-the-Loop (HIL) simulations are performed to verify the advantages of the designed aircraft and the proposed controller.
translated by 谷歌翻译